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Abstract: This paper considers adiabatic reduction in both discrete and 
continuous models of stochastic gene expression. In gene expression model, 
the concept of bursting is a production of several molecules simultaneously 
and is generally represented as a jump terms of random size. In a gen- 
eral two-dimensional birth and death discrete model, we prove that under 
specific assumptions and scaling (that are characteristics of the mRNA- 
protcin system) an adiabatic reduction leads to a one-dimensional discrete- 
state space model with bursting production. The burst term appears then 
through the reduction of the first variable. In a two-dimensional continu- 
ous model, we also prove that an adiabatic reduction can be performed in a 
stochastic slow/fast system. In this gene expression model, the production 
of mRNA (the fast variable) is assumed to be bursty and the production of 
protein (the slow variable) is linear as a function of mRNA. When the dy- 
namics of mRNA is assumed to be faster than the protein dynamics (due to 
a mRNA degradation rate larger than for the protein) we prove that, with 
the appropriate scaling, the bursting phenomena can be transmitted to the 
slow variable. We show that the reduced equation is either a stochastic dif- 
ferential equation with a jump Markov process or a deterministic ordinary 
differential equation depending on the scaling that is appropriate. 

These results are significant because adiabatic reduction techniques seem 
to have not been applied to a stochastic differential system containing a 
jump Markov process. Last but not least, for our particular system, the 
adiabatic reduction allows us to understand what are the necessary condi- 
tions for the bursting production-like of protein to occur. 
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92C40,60J25,60J75. 
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Introduction 

The adiabatic reduction techniques give results that allow to reduce the dimen- 
sion of a system and justify the use of an effective set of reduced equations 
in lieu of dealing with a full, higher dimensional model, if different time scales 
occur in the system. Adiabatic reduction results for deterministic systems of 
ordinary differential equations have been available since the very precise results 
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of Tikhonov (1952) and Fenichel (1979). The simplest results, in the hyperbolic 
case, give an effective construction of an uniformly asymptotically stable slow 
manifold (and hence a reduced equation) and prove the existence of an invariant 
manifold near the slow manifold, with (theoretically) any order of approxima- 
tion of this invariant manifold. Such precise and geometric results have been 
generalized to random systems of stochastic differential equation with Gaussian 
white noise (Bcrglund and Gentz (2006), see also Gardiner (1985) for previous 
work on the Fokker-Planck equation). However, to the best of our knowledge, 
analogous results for stochastic differential equations with jumps have not been 
obtained. 

The present paper gives a theoretical justification of an adiabatic reduction of 
a particular piecewise deterministic Markov process (Davis, 1984). The results 
we obtain do not give a bound on the error of the reduced system, but they 
do allow us to justify the use of a reduced system in the case of a piecewise 
deterministic Markov process. In fact, we prove limit theorems using martingale 
strategy, in a similar manner than in recent papers such as Crudu et al. (2012), 
Kang and Kurtz and Riedler, Thieullen and Wainrib (2012), where general con- 
vergence results for discrete models of stochastic reaction networks are given. 
In particular, these papers give alternative scaling of the traditional ordinary 
differential equation and the diffusion approximation depending on the different 
scaling chosen (see Ball et al. (2006) for some examples in a reaction network 
model). After the scaling, the limiting models can be deterministic (ordinary 
differential equation), stochastic (jump Markov process), or hybrid (piecewise 
deterministic process) . For illustrative and motivating examples given by a sim- 
ulation algorithm, see Haseltine and Rawlings (2002); Rao and Arkin (2003); 
Goutsias (2005). However, we emphasize that we do not consider here a continu- 
ous approximation of a discrete model. Rather, we perform adiabatic reduction 
on both discrete state-space and continuous state-space models. Time-scale re- 
duction have been considered in Kang and Kurtz, but not on the kind we perform 
here. 

Our particular model is meant to describe stochastic gene expression with 
explicit bursting (Friedman, Cai and Xie, 2006). In discrete state-space burst- 
ing models, the variables evolve under the action of a discrete birth and death 
process, interrupted by discrete positive jumps of random sizes. In continuous 
state-space bursting models, the variables evolve under the action of a contin- 
uous deterministic dynamical system, interrupted by positive jumps of random 
sizes. In both cases, the positive jumps model the burst production of several 
molecules instantaneously. In that sense, the convergence theorems we obtain 
in this paper can be seen as an example in which there is a reaction with 
size between and oo. We hope that the results here are generalizable to give 
insight into adiabatic reduction methods in more general stochastic hybrid sys- 
tems (Hespanha, 2006; Bujorianu and Lygeros, 2004). We note also that more 
geometrical approaches have been proposed to reduce the dimension of such 
systems in Bujorianu and Katocn (2008). 

Biologically, the bursting of mRNA or protein molecules is defined as the pro- 
duction of several molecules within a very short time, indistinguishable within 
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the time scale of the measurement. In the biological context of models of stochas- 
tic gene expression, explicit models of bursting mRNA and/or protein pro- 
duction have been analyzed recently, either using a discrete (Shahrezaei and 
Swain, 2008; Lei, 2009) or a continuous formalism (Friedman, Cai and Xie, 2006; 
Mackey, Tyran-Kamihska and Yvinec, 2011) as more and more experimental ev- 
idence from single-molecule visualization techniques has revealed the ubiquitous 
nature of this phenomenon (Ozbudak et al., 2002; Golding et al., 2005; Raj et al., 
2006; Elf, Li and Xie, 2007; Xie et al, 2008; Raj and van Oudenaarden, 2009; 
Suter et al., 2011). Traditional models of gene expression are composed of at 
least two variables (mRNA and protein, and sometimes the DNA state). The 
use of a reduced one-dimensional model (that has the advantage that it can 
be solved analytically) has been justified so far by an argument concerning the 
stationary distribution in Shahrezaei and Swain (2008). However, it is clear that 
two different models may have the same stationary distribution but very differ- 
ent behavior (continuous or discontinuous trajectories, monostable or bistable, 
etc; for an example in that context, see Mackey, Tyran-Kamihska and Yvinec 
(2011)). Hence, our results are of importance to rigorously prove the validity of 
using a reduced model. Our results are based on the standard assumption that 
the mRNA molecules have a shorter lifetime than the protein molecules, that 
is widely observed in both prokaryotes and eukaryotes (Schwanhausscr et al. 
(2011)). Depending on the assumed scaling of other kinetic parameters within 
the mRNA degradation rates, different limiting models are obtained. 

The paper is organized as follows. In the first section, we prove a reduction 
results for a discrete state-space model, that is a two-dimensional birth and 
death process. Assumptions on the birth and death rates are in agreement with 
a standard model of gene expression for the mRNA-protein system. That is 
both variables remain positive and birth of the second variable can occur only 
if the first variable is positive. Using an appropriate scaling of birth and death 
rates, we prove that this model converges to a general one-dimensional discrete 
bursting model. 

In the second section, we prove a reduction for a continuous state-space 
model, that is a two-dimensional piecewise deterministic model of gene expres- 
sion with a jump production term for the first variable. Using appropriate scaling 
on parameters, we prove that this model converge either to a deterministic or- 
dinary differential equation or to a one-dimensional continuous bursting model. 

1. A bursting model from a two-dimensional discrete model 

The fact that bursting models arise as a reduction procedure of a higher dimen- 
sional model was already observed in Shahrezaei and Swain (2008) and Crudu 
et al. (2012). In Shahrezaei and Swain (2008), the authors show that, within 
an appropriate scaling, the stationary distribution of a 2-dimensional discrete 
model converge to the stationary distribution of a 1-dimensional bursting model. 
The authors used analytic methods through the transport equation on the gen- 
erating function. Their result seems to be restricted to first-order kinetics. The 
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first variable is a fast variable that induces infrequent kicks to the second one. 
In Crudu et al. (2012), the authors show that, within an appropriate scaling, 
a fairly general discrete state space model with a binary variable converge to 
a bursting model with continuous state space. The authors obtained a conver- 
gence in law of the solution through Martingale techniques. The binary variable 
is a fast variable that, when switching in an "ON" state, induces kicks to the 
other variable. 

We present below analogous result of Crudu et al. (2012) when the fast vari- 
able is similar to the one of Shahrezaci and Swain (2008). Our limiting model is 
still a discrete state space model. These results are more precise than the one of 
Shahrezaei and Swain (2008), and more general (some kinetics rates can be non- 
linear). We use martingales techniques, with a proof that is similar to Crudu 
et al. (2012) and also inspired by results from Kang and Kurtz. We present be- 
low the model, then state our result in the subsection 1.1, and divide the proof 
in the three next subsections 1.2-1.4. 

We consider the following two-dimensional stochastic kinetic chemical reac- 



tion model 











> Ai, 


Production of X\ 


at rate Ai(X, X2) 




0, 


Destruction of X\ 


at rate 71 (Xi, X 2 ) 





^2(^1-^2) v 

— — y A 2 , 


Production of X 2 


at rate A 2 (Xi, X2) 


x 2 


^ Xl ' x *\ 0, 


Destruction of X2 


at rate 72 (Xi, X2) 



(1) 



with 71(0, X 2 ) = 72(^1,0) = to ensure positivity. This model can be rep- 
resented by a continuous time Markov chain in N 2 , and is then a general birth 
and death process in N 2 . It can be described by the following set of stochastic 
differential equations 



X 1 (t)=X 1 (0) + Y 1 (j X 1 {X 1 (s),X 2 {s))ds S j -Y a (J li(Xi(s), X 2 (s))<te), 
X 2 (t) = X 2 (0) + F 3 ( j X 2 (X 1 {s),X 2 (s))ds^ -F 4 ( / 72(A"i(s), X 2 (s))ds), 



where Yi, for i = 1...4 are independent standard poisson processes. The genera- 
tor of this process is given by 



lf(X 1 ,X 2 ) =X 1 (X 1 ,X 2 )[f(X 1 + 1,X 2 ) - f(X u X 2 ) 

+ 71 (Xi , X 2 ) [/ (Xi - 1, X 2 ) - /(Xi , X 2 ) 
+ A 2 (X 1 ,X 2 )[/(Xi ! X2 + l)-/(Xi,X 2 ) 
+ 72 (Xi,X 2 )[/(X 1 ,X 2 - 1) - /(Xi,X 2 ) 



(2) 



for every bounded function / on N 2 . 
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Exemple 1 . We have in mind the standard mRNA-Protcin system given by the 
following choice: 7j(Xi,X 2 ) = giXi with gi > for i = 1,2, Xi(Xi,X 2 ) = 
Ai(X 2 ) and A 2 (Xl,X 2 ) = k 2 X\ with k 2 > 0. Note however that even in the 
context of models of gene expression, different models have been proposed, that 
includes nonlinear feedback of mRNA and/or nonlinear degradation terms Bose 
and Ghosh (2012). 

1.1. Statement of the result 

We suppose the following scaling holds 

1 ?{X 1 ,X 2 )=N 11 {X 1 ,X 2 ) 
\%{X 1 ,X 2 )=N\ 2 {X 1 ,X 2 ) 

where N — > oo that is degradation of X\ and production of X 2 occurs at a 
faster time scale than the two other reactions. Then X\ is degraded very fast, 
and induces also as a very fast production of X 2 . The rescaled model is given 
by 



x» 



(t) =X 2 N (0) + y 3 ( J NX 2 (X^(s) 1 X 2 N ( S ))ds)-Y 4 ^ j ~f 2 (X»(s),] 



(3) 



(4) 



and the generator of this process is given by 

M N f(X 1 ,X 2 ) =Xi(X 1 ,X 2 )[f(X 1 + l,X 2 ) - f{X x ,X 2 ) 

+ iV 7l (X 1 ,X 2 )[/(X 1 - l,X 2 )-f(X 1 ,X 2 ) 
+ N\ 2 (X 1 ,X 2 )[f(X 1 ,X 2 + l)-f(X 1 ,X 2 ) 
+ 12(X U X 2 ) [f(X 1 ,X 2 -l)~ f(Xi,X 2 ) 

We can prove the following reduction holds: 
Theorem 1. We assume that 

1. The degradation function on X 2 satisfies 72(Xi,0) = 0. 

2. The degradation function on X\ satisfies 7i(0,X 2 ) = 0, and 

inf 7x{Xi,X 2 ) = 7 > 0. 

Xi>l,X 2 >0 - 



3. The production rate of X 2 satisfies \ 2 (0,X 2 ) = 0. 

4- The production rate function Ai and X 2 are linearly bounded by X\ + X 2 . 
5. Either Ai or \ 2 is bounded. 



R. Yvinecl / 'adiabatic reduction for PDMP 



6 



Let {X^ ,X^) the stochastic process whose generator is B^r (defined in eq. (i)). 
Assume that the initial vector (Xf (0), X£(0)) converges in distribution to (0, X(0)), 
as N — > oo. Then, for all T > 0, (X^(t), X^(t)) t >o converges in L 1 (0,T) (and 
in L p , 1 < p < oo) to (0,X(i)) where X(t) is the stochastic process whose 
generator is given by 

n OB <p(x) = \ 1 (o,x)(J p t ( 7l (i,.M-))(x)dt-cp(x)) 

+ 72 (0,X)[ V (X-l)-cp(X)], (5) 

where 

P t g(X) = E [g(Y(t, X)e~ £ 7i(i,r(».*))<fa] ^ 

and Y(t, X) is the stochastic process starting at X at t — whose generator is 
given by 

Ag(Y) = \ 2 (l,Y)(g(Y + l)-g(Y)). 

Remark 2. The first three hypotheses of theorem 1 are the main characteristics 
of the mRNA-protein system (see example 1). Basically, they impose that quan- 
tities remains non-negative, that the first variable has always the possibility to 
decrease to (no matter the value of the second variable) , and that the second 
variable cannot increase when the first variable is 0. Hence these three hypothe- 
ses will guarantee that (with our particular scaling) the first variable converges 
to 0, and will lead to an intermittent production of the second variable. The 
last two hypotheses arc more technical, and guarantee that the Markov chain is 
not explosive, and hence well defined for all t > 0, and that the limiting model 
is well defined too. 

Remark 3. The above expression eq. (5) is a generator of a bursting model for a 
"general bursting size distribution". For instance, for linear function 71 (Xi, X2) = 
giXi, and X 2 (Xi, X 2 ) = k 2 Xx, we have 



Pt(li(-M-))(P) = 9iPtiv)(p), 



e -sit 



. 9l e-^5>(z)P{lf = z}, 



It follows by integration integration by parts that 

/■OO 7 

/ p t {ii{.)ip{.)){y)dt = — x-^X^( z + ^H7H L_ 
which gives then an additive geometric burst size distribution of parameter 
P = k2+gi ' as ex P ec * e d Shahrezaei and Swain (2008). 
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We divide the proof in three steps: moment estimates, tightness and identi- 
fication of the limit. 

1.2. Moment estimates 

Because production rates are linearly bounded, it is straightforward that with 
f(Xi,X2) = X\ + X2 in eq. (4), there is a constant Cm (that depends on N 
and other parameters) such that 

M N f(X u X 2 ) <C N (X 1 +X 2 ). 

Then E[X^(t) + X? (t)] is bounded on any time interval [0,T] and 

f{x»{t),x?(t)) f(x?(p),x»(p)) f m N f(x»( s ), x?( s ))ds 

Jo 

is a ^-martingale. 

1.3. Tightness 

Clearly, from the stochastic differential equation on X± , we must have 

X^(t) — > 0. We can show in fact that the Lebesgue measure of the set 

{t < T : X^(t) 0} converges to 0. Indeed, taking f(X 1 ,X 2 ) = X x in eq. (4), we 

have 

X»{t) - X? (0) - f (Ai(^ (s),X»{s)) - N 7l (X»(s),X?( s )))ds (6) 
Jo 

is a martingale. Thanks to the lower bound assumption on 71, we have 

X E[ f l {X N {s) > 1} ds] <E [\(X?(8),X?(s))d8. 
Jo Jo 

Then, by the martingale property, we deduce from (6) 

2 NE[ [ l { x»( s )>i}ds] < E[Xf (0)] + f E[\i{X?(s), X? (s))}ds. (7) 
Jo Jo 

Now for X 2 we obtain from eq. (3), 

X?(t) < X 2 N (0) + Y 3 (J* Nl {x „ W > 1} A 2 (Xf (s),X? (s))ds) . 

Let us now distinguish between the two cases. 

• Suppose first that A2 is bounded (say by K). Then 

E[X^(t)] <E[X 2 »] +KNE[ [ l {x »(s)>i}ds]. 

Jo 
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As Ai is linearly bounded (say by K) by + X% , the upper bound 
eq. (7) becomes 

lNE[J l { x fW > 1} ds] <E[X?(0)]+kJ (e[X» (s)] +E[X? ( s )])ds. 

Finally, with eq. (6), it is clear that 

E[X?(t)] <E[X N (0)] +K J (e[X?(s)] +E[X?(s)]y s . 

Hence, with the three last inequalities, we can conclude by the Grdnwall 
lemma that E[X^(t)] is bounded on [0,T], uniformly in N. Then 

/ 1 {x-{s)>i}ds] 
Jo 

is bounded and X^ — > in L 1 ([0,T],N). By the law of large number, 
jjYs(N) is almost surely convergent, and hence almost surely bounded. 
We deduce then there exists a random variable C such that 

X^{t) < X?(p) + NC f l {x ?(.)>i}d8, 
Jo 

almost everywhere. By Gronwall lemma and Markov inequality 

P{ sup X?(t) > M } -> 
te[o,T] 

as M — > oo, uniformly in N. 
• Now suppose Ai is bounded (say K). By the martingale eq. (6) (and the 
same lower bound hypothesis on 71, it is clear that 

Jo 

is bounded and X? -> in L 1 ([0, T],N). Now, let us denote U N (t) = 
^X?(t), V N = jfX?(t) and W N = Nl {X N {t) > 1} (which is then bounded 
in L 1 ([0,T[)). From eq. (3), and from the linear bound on A2 (say by K) 

V N (t) < V N (0) + -^Y 3 (J* NKW N (U N (s) + V N (s))d s y 

Then, still by the law of the large number there exists a random variable 
C such that 

V N {t) < V N {0) + C [ W N (U N {s) + V N {s))ds, 
Jo 
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and hence 



X?it) < X 2 N (0) + C [ W N (X?( S ) + X 2 N ( S ))ds. 



Jo 



By Gronwall lemma, 



sup X?(t) < (Xf(0) + X 2 W (0)) cxp (C f W N (s)ds 




which is then bounded, uniformly in N. 



For any subdivision of [0,T], = t < ti < • • • < t n = T, 




< Y 3 ([ Nl {x * {a) > 1} \ 3 (X?{s),X?(s))ds 



v Jo 



so by a similar argument as above, we also get the tightness of the BV norm 



as K -> 0, independently in N. Then X^ is tight in L p ([0, T}), for any 1 < p < 
oo (Giusti (1984)). 

1. 4- Identification of the limit 

We choose an adherence value (0,^2 (i)) of the sequence (X^^t), X^^t)) in 
L 1 ([0,T]) x L p ([0,T]). Then a subsequence (again denoted by) (X?(t),X£(t)) 
converge to (0,X 2 (i)), almost surely and for almost t £ [0,T]. We are looking- 
for test-functions such that 



is a martingale and Bjy/(X^ v (s), X 2 N (s)) is bounded independently of N when 
X\ > 1. The following choice is inspired by Crudu et al. (2012). We introduce 
the stochastic process Y t x,v , starting at y and whose generator is 



for any x > 1. and we introduce the semigroup defined on bounded function, 
for any x > 1, by 



H\\X?\\[o,t]>K}^0 




A x g(y) = X 2 (x,y) g(y+l)-g(y) 



P?g(y)=E\g(Yr)e-ti^ Y *' V) 



(8) 
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Then the semigroup Pf satisfies the equation 

dP?g(y) 



dt 



A x P^g{y)- 11 {x,y)P?g{y). 



Now for any bounded function g, define recursively 

f(0,y) = g(y), 

poo 

f(x,y) = / Pf( 7 i(x,.)/(x-l,.))(y)*. 



Such a test function is well defined by the assumption on 71. We then verify 
that 

B w /(0, y) = Ax(0, y) ( J P/( 7 i(l, -)g(-)){y)dt - g{y)) + 72(0, y) [g(y - 1) - g(y) 
Bjv/(x,y) = M(x,y) f(x + l,y)-f(x,y) +j 2 (x,y) f(x,y-l)-f(x,y) 
Indeed, for any x > 1, 

^/(z,?/) -7i(^2/)/(^2/) 

A K Pf ( 7l (x, .)f(x - 1, .))(») - y)P t x (li(x, .)f(x - 1, .))(y)dt 
d 



dt 



Pr(^i(x,.)f(x-l,.))(y)dt, 



lim P t x ( 7 i(a:,.)/(a;,.))(j/) -yi(x,y)f(x- l,y), 
-7i{x,y)f(x - l,y). 



Then 



= 0. 



^(x,y) f(x,y + 1) — f(x, y) +7i(x,y) f(x-l,y)-f(x,y) 
Hence Mjsrf{x,y) is independent of N, and, taking the limit TV — > 00 in 



we deduce 



5 (X 2 (i))- 5 (X 2 (0))- 
is a martingale where 

Boo<?(y) = A 1 (o,y)( / p/(7i(i,.)5(-))(y)*-.9(y)) +72(0,2/) [ 5 (y-i)-.9(y) 



Uniqueness Due to assumption on k\ and k 2 , the limiting generator defines 
a pure-jump Markov process in N which is not explosive. Uniqueness of the 
martingale then follows classically. 
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2. Continuous-state bursting model 

The model we consider now is a continuous state-space model that explicitly 
assume the production of several molecules instantaneously, through a jump 
Markov process, in agreement with experimental observations (Golding et al. 
(2005); Raj et al. (2006)). In line with experimental observations, it is stan- 
dard to assume a Markovian hypothesis (an exponential waiting time between 
production jumps) and that the jump sizes are exponentially distributed (geo- 
metrically in the discrete case) as well (Suter et al. (2011)). The intensity of the 
jumps can be a linearly bounded function, to allow for self-regulation. 

For simplicity, we will only consider the standard model of gene expression, 
that is with linear degradation rates and the production rate of the second 
variable is linear with respect to the first variable (as in example 1). Note that 
more general rates as in the previous section could be considered as well. Here, 
we ask the question of what is the correct scaling so that the bursting production 
term is transmitted from the first variable to the second one, when the first 
variable is eliminated through an adiabatic limit. The propagation of bursting 
property in a gene network is an important question in molecular biology Kaern 
et al. (2005). 

This section is structured as follows. We first present the model in the rest 
of this paragraph, then state the results in subsection 2.1, and divide the proofs 
in the remaining three subsection 2.2,2.3,2.4. 

Let X\ and x 2 denote the concentrations of mRNA and protein respectively. 
A simple model of single gene expression with bursting transcription is given by 

dx~\ ° 

— = -gix 1 +N(h,\ 1 (x i )), (9) 

= -.922:2 + k 2 xi. (10) 

Here gi and g 2 are the degradation rates for the mRNA and protein respectively, 
k 2 is the mRNA translation rate, and N(h, Ai(x2)) describes the transcription 
that is assumed to be a compound Poisson white noise occurring at a rate Xi(x 2 ) 
with a non-negative jump size Axi distributed with density h. 
The equations (9)-(10) are a short hand notation for 

Xi(t) = xi(0)- / g 1 x 1 (s~)ds (11) 



Jo 

t poo poo 



JO JO 



l {r<x 1 (x 2 (s-))}zN(ds, dz, dr), 



x 2 (t) = x 2 (0) - / g 2 x 2 (s )ds + / k 2 xi(s )ds. (12) 
Jo Jo 

where X s _ = lim t ^ s - X(t), and N(ds,dz,dr) is a Poisson random measure on 
(0, 00) x [0, oo) 2 with intensity dsh(z)dzdr , where s denotes the times of the 
jumps, r is the state-dependency in an acceptance/rejection fashion, and z the 
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jump size. Note that (xi(t)) is a stochastic process with almost surely finite 
variation on any bounded interval (0, T), so that the last integral is well defined 
as a Stieltjes-integral. 

Hypothesis 2. The following discussion is valid for general rate functions Ai 
and density functions h(-) that satisfy 

• Ai G C 1 , Ai is globally lipschitz and linearly bounded with 

< Ai(x) < c + Kx. 

• h G C° and J °° xh(x)dx < oo. 

For such a general density function h, we denote the average burst size by 

xh(x)dx. (13) 

Remark 4. Hill functions are often used to model gene self- regulation, so that 
Ai is given by 

Ai(a;2 = - a 

where L, D are positive parameters and a is a positive integer (see Mackey, 
Tyran-Kaminska and Yvinec (2011) for more details). An exponential distribu- 
tion of the bursting transcription is often used in modeling gene expression, in 
accordance with experimental findings (Xie et al. (2008)), so that the density 
function h is given by 

h(x) = l -e-*'\ 

with b the average burst size. 

If \\{x2) = k\ is independent of the state x%, the average transcription rate 
is bk\ , and the asymptotic average mRNA and protein concentrations are 

arf , :=E[a:i(t-)-oo)l = — . 



x 2 y := E[x2(t — > oo)l 

92 ' 9i92 



2.1. Statement of the results 



Although Equations (9)- (10) are simple, they are not analytically solvable. 
Hence, for pratical use to interpret experimental data, and to avoid numeri- 
cal simulations, one looks for a reduced, analytically solvable, one-dimensional 
equation. In the following discussion, we consider the situation when mRNA 
degradation is a fast process, i.e. g\ is "large enough", but the average equilib- 
rium protein concentration x^ remains unchanged. In most organisms and for 
most genes, mRNA has a smaller lifetime than protein (Schwanhausscr et al. 
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(2011)). In what follows, we denote by g™, k 2 sequences of parameters, A" se- 
quence of functions and h n sequence of density function that will replace gl, 
fc 2 , Ai, h in (11)-(12) and satisfy hypothesis 2. We then denote (x™,^) its 
associated solution. We will always assume one of the following three scaling 
relations: 

(51) Frequent production rate of mRNA, namely gf = ngi, A" = nAi, and 
k 2 = ^2 h n = h are independent of n; 

(52) Large burst of mRNA, namely = ng 1} h n {z) = ±h{Z) and A™ = 
AijAj = ^2 remain unchanged; 

(53) Large production rate of protein, namely g" = ngi, k 2 = nk 2 , and A" = Ax 
h n — h are independent of n; 

These three different scaling are then associated with different behaviors of 
the biological systems given by (xi,x 2 ). As different genes may have different 
kinetics, each one of the possible scaling are of importance (Suter et al. (2011); 
Schwanhausscr ct al. (2011)). 

In this section we determine an effective reduced equation for equation (10) 
for each of the three scaling conditions (S1)-(S3). In particular, we show that 
under assumption (SI), equation (10) can be approximated by the deterministic 
ordinary differential equation 

-JjT = ^.92-^2 + A 2 (x 2 ) (14) 

where 

A 2 (a;2) = bk 2 Xi(x 2 )/gi. 
We further show that under the scaling relations (S2) or (S3), equation (10) can 
be reduced to the stochastic differential equation 

*El = -g 2 Z2 + N(h,\ 1 (x 3 )). (15) 

at 

where ft, is a suitable density function in the jump size Ax 2 (to be detailed 
below). 

We first explain, using some heuristic arguments, the differences between the 
three scaling relations and the associated results. When n — > oo, — > oo and 
applying a standard quasi-cquilibrium assumption we have 

dx\ 



dt "°> 



which yields 



x n i(t) « ±N(h n (.),X?(x2)) = N(g?h n (g?-),\<?(x%)), 
and therefore the second equation (10) becomes 

rtv n h n „ 

« - 9 2X n 2 +%N(h%),XUxm 
at g\ 

« -72^ + ^v(f|^(f|),A?(^) 
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Hence in (15), h(x 2 ) = (fa/ ' g\)~ 1 h((fa/ 'gi)~ 1 x 2 ) under the scaling (S2) and 
(S3). Furthermore, we note that the scaling (S2) also implies nh n (n-) = h(-), 
while in (SI), nh n (n-) = nh(n-) so that the jumps become more frequent and 
smaller. 

We denote (D[0, oo),S) the cadlad function space of function defined on 
[0, oo) at values in M + with the usual Skorohod topology (Jacod and Shiryaev 
(1987)). Similarly (D[0,T],J) is the cadlag funtion space on [0,T], with the 
Jakubowski topology (Jakubowski (1997)). Also, L p [0,T) the space of IP inte- 
grable function on [0, T), with T > 0, which we endowed with total variation 
norm (Giusti (1984)), and M(0,oo) is the space of real measurable function on 
[0,oo) with the metric (Kurtz (1991)) 

/>oo 

d(x, y) = I e~* max{l, | x(t) - y(t) \}dt. 
Jo 

Our main results can be stated as follows 

Theorem 3. Consider the equations (9)-(10) and assume Hypothesis 2. If the 
scaling (SI) is satisfied, i.e., fe™ = nk\, and if x^ (0) — > x\, then when n — > oo, 

1. The stochastic process x™(t) does not converge in any functional sense; 

2. The stochastic process x%(t) converges in law in (D[0, oo),S) towards the 
deterministic solution of the ordinary differential equation 

^ = -.92£2 + A 2 (x 2 ), X 2 (0)=X2, (16) 

where 

A 2 (a; 2 ) = bfa\i(x 2 )/gi. 

Theorem 4. Consider the equations (9)-(10) and assume Hypothesis 2. If the 
scaling (S2) is satisfied, i.e., h n (z) = —h(-), and if xV/(ti) — > x%, then when 
n — > oo, 

1. The stochastic process x% — converges in law in IP , 1 < p < oo and in 
(D[0,T], J) to the (deterministic) fixed value 0; 

2. The stochastic process x 2 (t) converges in law in IP , 1 < p < oo and in 
(D[0,T], J) to the stochastic process defined by the solution of the stochas- 
tic differential equation 

^ = -g 2 x 2 +N(h,X 1 ), x 2 (0) = x° 2 >0, (17) 

where h(x 2 ) = (fa/gi)~ 1 h((fa/g 1 )~ 1 x 2 ). 

Theorem 5. Consider the equations (9)-(10) and assume Hypothesis 2. If the 
scaling (S3) is satisfied, i.e., fc 2 = nfa, and if x 2 (0) — > x 2 , then when n — > oo, 

1. The stochastic process x"(t) converges in law in IP , 1 < p < oo and in 
(D[0,T], J) to the (deterministic) fixed value 0; 



R. Yvinecl / 'adiabatic reduction for PDMP 



15 



2. The stochastic process x%{t) converges in law in L p , 1 < p < oo and 
in (D[0,T], J) to the stochastic process determined by the solution of the 
stochastic differential equation 

^ = -~(2X2 + N(h,ip), x 2 (0) = x° 2 >0, 

where h(x 2 ) = {k 2 /giy 1 h((k 2 /g 1 y 1 x 2 ). 

Remark 5. Note that scalings (S2) and (S3) give similar results for the equation 
governing the protein variable x 2 (t) but very different results for the asymptotic 
stochastic process related to the mRNA. In particular, in Theorem 4, very large 
bursts of mRNA are transmitted to the protein, where in Theorem 5, very rarely 
is mRNA present but when present it is efficiently synthesized into a burst of 
protein. 

In the rest of this paper, we provide proofs of the results mentioned above, 
using martingale techniques. In a companion paper Yvinec et al. (2012), we use 
partial differential techniques to prove similar results (sec also Hascltinc and 
Rawlings (2005); Zcron and Santillan (2010); Santilln and Qian (2011)). 

The proofs of the three theorems above are divided in three steps. In section 
2.2 we first recall generator properties and derive moment estimates associated 
to (9)-(10). In section 2.3 we show the tightness result for all three theorems. 
We then identify the limit using a martingale approach in section 2.4. 

2.2. General properties and moment estimates 

We first summarize the important background results on the stochastic processes 
used in the next. 

One dimensional equation For the one-dimensional stochastic differential 
equation (15) perturbed by a compound Poisson white noise, of intensity A(.) 
and jump size distribution /i(.), the extended generator of the stochastic process 
(x 2 (t)) t >o is (Davis, 1984, Theorem 5.5), for any / G T>{A 00 ), 

Aif(x) = -92X^ + X(x) [J h(z- x)f(z)dz - f(xj) 

V(Ai) = {/ G M(0, oo) : t H> f{xe" l2t ) is absolutely 
continuous for t G 7Z + and 

E 1/02(1*)) - f{x 2 (T-))\ < oo for all t > 0} 

Ti<t 

where 7W(0, oo) denotes a Borcl-measurable function of (0, oo) and the times 
Ti are the instants of the jump of x 2 . It is an extended domain containing 
all functions that are sufficiently smooth along the deterministic trajectories 
between the jumps, and with a bounded total variation induced by the jumps. 
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For any / € V(Ai), we have 

^-Ef(x 2 (t))=EA 1 (f(x 2 (t))). 

Two dimensional equation Consideration the two-dimensional stochastic 
differential equation (9)- (10) perturbed by a compound Poisson white noise, 
of intensity Xi(x2) and jump size distribution h follows along similar lines. Its 
infinitesimal generator and extended domain are 

A 2 g(x 1 ,x 2 ) = -gixi- h (k 2 xi - g 2 x 2 )- — 

ax\ ox 2 

+ \i(x 2 )^J h(z - xi)g{z,x 2 )dz - g{xi,x 2 ) S j 1 (18) 

T>(A 2 ) = {g € 7W((0,oo) 2 ) : 1 1-> g{(j) t {xi,x 2 )) is absolutely (19) 
continuous for t G 1Z + and 

E ^ \g{xi(T i ),x 2 (T i )) -g{ Xl {Tr), X2 (Tr))\ < oo for all t > 0} 

Ti<t 

where <pt is the deterministic flow given by the deterministic part of equations 
(9)-(10), namely 

dx\ 

lit = ~ giXU 

dx2 , u 

—j- = ~g2X 2 + k 2 xi. 

at 

For any / € T>(A 2 ), we have 

^-Ef(x 1 (t),x 2 (t))=EA 2 (f(x 1 (t),x 2 (t))). (20) 
dt 

Using the stochastic differential equations (11)-(12), we can deduce moment 
estimates, needed to be able to use unbounded test function (namely f(x\, x 2 ) = 
X\ and f(x\,x 2 ) = x 2 ) in the martingale formulation. By taking the mean into 
(11)-(12), neglecting negatives values and using hypothesis 2, 



< E[xi(t)] <E[ / b\i(x 2 (s))ds] < / b(c + KE[x 2 (s)])ds 
Jo Jo 

< E[x 2 (t)] < [ k 2 E[ Xl (s)]ds 
Jo 

where we note b = E[/t,] = J Q zh{z)dz. By Gronwall inequalities, there exist a 
constant C such that 

E[ sup xi(t)] < C(E[xi(0)] +e CT ) 

W] (21) 
E[ sup x 2 (t)] < C{E[x 2 (0)] +e CT ) 
te[o,T] 
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Then we claim that f(xi,x 2 ) = Xi is in the domain of the generator A 2 . We 
only have to verify (see Eq (19)) 

E Y \xi{Ti) - x x (Tr)\ < oo for all t > 0. 

Ti<t 

By equation (11) 

ft />oo />00 

E Y \xi(Ti) - x x {T~)\ = E / l {r < Xl{x2(s - m zN(ds,dz,dr), 

T<t Jo Jo Jo 

< bE[ [ c + Kx 2 (s)ds\. 
Jo 

which is finite according to the previous estimates. 
2.3. Tightness 

SI We first show the tightness property for the scaling (SI) corresponding to 
theorem 3. In such case a;™ does no converge in any functional sense because it 
fluctuates very fast, as more and more jumps appears of size that stay of order 
1 (given by h). However, E[a;™(i)] remains bounded, — goes to 0, and by eq. 
(12), 

I a£(t) l<l a£(0) I + / k 2 | | da. 



Jo 

For any n, let N n be the compound Poisson process associated to (11), with 
{T n ,i}iL\ the jump times which occur at a rate n\i(x 2 (s) n ), and {Z rly i}°l 1 the 
jump sizes that are iid random variables with density h (with the convention 
T n ,o = and Z nfl = Xq), 

N n (t)= y 

T„,i<i 

Then 

!?(*)= Y ^ e "" 9l(t " T "' l)l {*>T„, l} - 
T n ,i<t 

By integration, 

f x n 1 {s)ds= Y Zn^ — il-e-^-^l^r^y 
J ° T n , f <t 7191 

Then, 

a#(t) < a£(0) + / k 2 x r l(s)ds <x 2 l (0) + — V Z„ 4 . 

Finally we deduce, by definition of the compound Poisson process, 

aS(t) < x™(0) + —N n (t). 

ngi 
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Now, by a time change, there exists a process Y such that 

N n (t)=Y([ nAi(a£00)ds), 
Jo 

where Y is unit rate compound Poisson process of jump size iid (with density 
h). By the law of large number, ^Y(nt) is almost surely convergent, and hence 
almost surely bounded. We deduce then there exists a random variable C such 
that 

x"(t)<a;™(0) + — C / Ai(a£(s))da. 
3i Jo 

By Gronwall lemma and Markov inequality 

P{ sup a#(i) > M} -> 0, 
te[o,r] 

as M -> oo and uniformly in n. Similarly, for any t\,t2 G [0, T], 

I i5(t 2 ) - ^(to |< A. | iv B ( t2 ) _ iv n (ix) i . 

Again, iV n (i 2 ) — N n (ti) = y( / nAi(:r 2 (s))<is) and, still by the law of large 

Jti 

number ^ 

| ^(t 2 ) - x2(h) \<-C [ 2 X 1 ( 3 q(a))ds, 
9i Jti 

so that , for any e > 

lim limsup sup P{ | x 2 l (S* 2 ) - |> e} = 0, 

n Si<S 2 <Si+9 

where the supremum is over stopping times bounded by T. Then by Aldous' 
tightness criteria ((Jacod and Shiryaev, 1987, thm 4.5 p 356)), £ 2 is tight in 
(£>[(), oo),S). 

S3 Now we show the tightness property for the scaling (S3) corresponding to 
theorem 5, with fc 2 = n/c 2 . In such case a;™ converges to in L , and we get a 
control over n L x™(s)ds. Indeed using c/(xi,x 2 ) = x\ in (18), we get 

- 4(0) - f (-n 9l x^(s) + bXi{x 2 {a) n )da), 
Jo 

is a martingale so that due to Hypothesis 2, 

5iE[n / <(s)ds] <E[x?(0)] +&(ci + if / E[x"(s)] ds) 
Jo Jo 

By eq. (12), 

a#(t) < E[x 2 l (0)] +/c 2 n / x"(s)ds. 

Jo 
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then 

sup ^(i) < E[ij(0)] +k 2 n [ x 7 l(s)ds. 
te[o,T] Jo 

Reporting into the estimates for x™ yelds 

gi E[n f x™(s)ds] < E[s?(0)] + b(ct + K(E[x%(0)] +tk 2 n [ E[x"(s)]ds)), 
Jo Jo 

< C£ + C£E[ra f x^(s)ds], 
Jo 

for two constants C^, Cf. that depends solely on T. Then E[n / *x™(s)ds] is 
bounded uniformly in n so that converges to in L 1 and 

P{ sup a#(i) > M} -> 
te[o,T] 

as A/ — > oo and uniformly in n. Now for any subdivision of [0, T], = to < t\ < 

■ ■ ■ < t n = T, 

n-1 „ t 

V I x n 2 (U +1 ) - x n 2 {U) | < E[x™(0)] + k 2 n / <(s)ds, 

so that we also get the tightness of the BV norm, 

P{KII[o,T] >M} -+0, 

as M — > 0, independently in n. Then x 2 is tight in L p ([0,T}), for any 1 < p < 
oo (Giusti (1984)) and also, by a similar criteria, in (D[0, T], J) (Jakubowski 
(1997)). 

S2 Now we show the tightness property for the scaling (S2) corresponding to 
theorem 4, with h n = —h(-). Remark that on such case, denoting z n = — , the 
variables (z n ,x 2 ) satisfies (11)-(12) with the (S3) scaling, so we already now 
that x 2 is tight in L p ([0,T]), for any 1 < p < oo. 

For x™ , formally, note that each jumps yelds a contribution for J x™ of ^- so 
there's no hope for a convergence to in L 1 . However, we still have 

<(i)= Z n<i e- n ^- T ^ l t > Tn .. 

T n ,i<t 

where T n ,i appears with rate \\{x^ (s)), and {Z n ^}°^ 1 are iid random variables 
with density h n . Then 

a?i(t) < ^( 1 {[t„, i ,t„, i+ -^ } + e"" 91 ^ l t > Tn)i 
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But for M > 0, by Markov inequality, 

P\Z n ,e-^ 31 > M\ < — < e, 

for any e and n sufficiently large. Then, conditionning by the jump times, 
ft i 

/ PK(s) > M | T n>i } < -^Mt>T^}+ e (*- r ».<)l{t>T Bl< }<e. 

T n ,i<t T n ,i<t 

for n large. Because / * £2 ( s )^ s has been shown to be bounded independently 
of n, we can drop the conditionning, and J* P{x"(s) > M} is arbitrary small. 
We show also similarly that 

lim sup / max(l , | x"(i + /i) - x"(i) = 0, 

n Jo 

so that x\ is tight in M(0,oo) ((Kurtz, 1991, thm 4.1)). 



2-4- Identification with the martingale problem 

The three theorems below can be proved using martingale techniques, with 
similar spirit. For each scaling, the generator A% can be decomposed into a fast 
component, or order n, and a slow component, of order 1. In each case, one need 
to find particular condition to ensure that the fast component vanishes. For the 
scaling (SI), the fast component acts only in the first variable, so ergodicity 
of this component will ensure that it vanishes, as in averaging theorems Kurtz 
(1992). For the other two scaling, the fast component acts on both variables, and 
we will have to find the particular relation between both variable that ensures 
this component vanishes. 

Proof of Theorem 3 For any B 6 B(R+), t > 0, we define the occupation 
measure ^ 

V?(Bx[0,t])= [ l {B} (x?(s))ds, 
Jo 

and we identify V™ as a stochastic process with value in the space of finite 
mcaurc on K+. Because E[a;f(i)] remains bounded uniformly in n on any [0, T], 
it is stochastically bounded and V\ then satisfies Aldous criteria of tightness. 
Now take a test function / that depends only on X\, so that 

A2f(xi) = nC x J( Xl ), 

with 

C X2 f{xi) = -gixif(xi) + Xi(x 2 ) I J h(z - x\)f(z)dz - f(xi) 
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Then 



M t " = /(z?(t)) - /(x?(0)) - n / / C xUs) f{ Xl )V^{d Xl x ds) 



is a martingale. Dividing by n, for any limiting point (l/i,x 2 ), we must have, 
for any / e C 6 (R+), 

E [ / / Cx 2 ( 8 )/(a=i)^i(^i x ds)] = 0. 



Because for any X2, the generator C K2 is (exponentially) ergodic, V\ is uniquely 
determined by the invariant measure associated to C X2 ■ In particular, for any 
t > 

* f* b 

xiV™(dxi x ds) — s- / — Ai(x 2 (s))ds, 
i + o u Jo 5i 

Then for / that depends only on £2, 

/(a£(t)) - /(xj(0)) - / / t (fc 2 x 1 - 52 ^( S ))/'(4 l ( S ))n n (^i x ^) 



converges to 



/(x 2 (i))-/(x 2 (0))- / (^(x^s))-^*))/'^))^ 

Due to the assumption on Ai, there exists a unique solution associated to the 
(deterministic) equation 14 so X2 is uniquely determined. 

Proof of Theorem 5 We've already seen that x™ converges to in L 1 ([0, T]) 
and x 2 is tight in L p ([0,T]). We then take a subsequence (x"(f), x 2 (t)) that 
converges to (0, x 2 (i)), almost surely and for almost t £ [0, T]. Then we consider 
the fast component of the generator A% , given in this case by 

df , df 

-gixi- 1- k 2 xi- — . 

axi 0x2 

This defines a transport equation. Starting at (xi,X2) at time 0, the asymptotic 
value of the flow associated to the transport equation is (0, y) where 



K2Z «2 

y — x 2 + / k2Xi(s)ds — x 2 + / dz = x 2 H X\ 



9\z 9i 



Wc then consider ^ 

/(xi,x 2 ) = g(x 2 + — xi), 
9i 

that satisfies, for any xi,x 2 , 
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Now taking the limit n — > oo into 

/(*?(<), - /(x?(0), zj(0)) - ^S/(X?(«), 

Jo 

yelds 

g(x 2 (t)) - 5(2:2(0)) - / -92x29' (x 2 (s)) 
Jo 

+ Ai(x 2 (s)) / /i(2)5(x 2 (s) + - g(x 2 (s)) )ds, 



where /i(x 2 ) = (fc 2 /gfi) 1 /i((fc 2 /f/i) 1 x 2 ). Hence the limiting process x 2 must 
satisfy the martingale problem associated with the generator 

A^gix) = ~92X^ +Xi(x)(^J h{z- x)f(z)dz - /(x)) , 

for which uniqueness holds for bounded k\ (see (Crudu et al., 2012, thm 2.5)). 
A truncature argument allows then to conclude for linearly bounded k\ . 

Proof of Theorem 4 As noticed before, (z™,x 2 ) with z n (t) = Xl i fy satisfies 
the scaling (S3) so similar conclusion holds for x 2 . 
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